Precise Estimation of Vocal Tract and Voice Source Characteristics

Authors

  • Yoshinori Shiga
  • Steve Isard
  • Steve Renals
  • Hiroshi Shimodaira
Abstract

This thesis addresses the problem of quality degradation in speech produced by parameter-based speech synthesis, within the framework of an articulatory-acoustic forward mapping. I first investigate current problems in speech parameterisation, and show that conventional parameterisation extracts the vocal tract response inaccurately because of interference from the harmonic structure of voiced speech. To overcome this problem, I introduce a method for estimating filter responses more precisely from periodic signals. The method performs the estimation in the frequency domain by approximating all the harmonics observed in several frames based on a least squares criterion. It is shown that the proposed method estimates the response more accurately than widely-used frame-by-frame parameterisation, both in simulations using synthetic speech and in an articulatory-acoustic mapping using actual speech. I also deal with the source-filter separation problem and independent control of the voice source characteristic during speech synthesis. I propose a statistical approach to separating out the vocal-tract filter response from the voice source characteristic using a large articulatory database. The approach realises such separation for voiced speech using an iterative approximation procedure under the assumption that the speech production process is a linear system composed of a voice source and a vocal-tract filter, and that each of the components is controlled independently by different sets of factors. Experimental results show that controlling the source characteristic greatly improves the accuracy of the articulatory-acoustic mapping, and that the spectral variation of the source characteristic is evidently influenced by the fundamental frequency or the power of speech.
The thesis provides more accurate acoustical approximation of the vocal tract response, which will be beneficial in a wide range of speech technologies, and lays the groundwork in speech science for a new type of corpus-based statistical solution to the source-filter separation problem.
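
The multi-frame least squares idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the envelope model, so the sketch assumes a low-order cosine series in log amplitude; the function names, the model order, and the synthetic test values are all hypothetical and are not taken from the thesis. Harmonics from several frames with different fundamental frequencies sample the same envelope at different frequencies, so pooling them gives a denser set of points for the fit than any single frame provides.

```python
import numpy as np

def cosine_basis(freqs_hz, fs, order):
    """Matrix with columns cos(k * omega) for k = 0..order (omega in rad/sample)."""
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float) / fs
    return np.cos(np.outer(omega, np.arange(order + 1)))

def fit_envelope(harm_freqs_hz, harm_log_amps, fs, order):
    """Least-squares fit of an assumed cosine-series log envelope to pooled harmonics."""
    basis = cosine_basis(harm_freqs_hz, fs, order)
    target = np.asarray(harm_log_amps, dtype=float)
    coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return coeffs

def eval_envelope(coeffs, freqs_hz, fs):
    """Evaluate the fitted log envelope at arbitrary frequencies."""
    return cosine_basis(freqs_hz, fs, len(coeffs) - 1) @ coeffs

# Synthetic check (all values are illustrative assumptions).
fs = 16000.0
true_coeffs = np.array([0.0, 1.0, -0.5, 0.3, 0.2, -0.1])
frame_f0s = [100.0, 130.0, 170.0]  # one F0 per analysis frame

# Pool the harmonic frequencies of every frame below the Nyquist frequency.
freqs = np.concatenate([np.arange(f0, fs / 2.0, f0) for f0 in frame_f0s])
log_amps = eval_envelope(true_coeffs, freqs, fs)  # noise-free "observations"

est = fit_envelope(freqs, log_amps, fs, order=5)
max_error = float(np.max(np.abs(est - true_coeffs)))
```

With noise-free synthetic harmonics the fitted coefficients should match the generating ones to numerical precision; real speech would also need harmonic peak picking and some handling of frame-to-frame envelope change, which this sketch leaves out.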

Similar resources

Effects of Voice Therapy on Vocal Tract Discomfort in Muscle Tension Dysphonia

Introduction: Patients with muscle tension dysphonia (MTD) suffer from several physical discomforts in their vocal tract. However, few studies have examined the effects of voice therapy (VT) on the vocal tract discomfort (VTD) in patients with voice disorders. Therefore, the aim of the present study was to investigate the effects of VT on the VTD in patients with MTD.   Materi...

The Vocal Tract in Singing

Precise control of the vocal tract configuration is of critical importance for producing the desired acoustic characteristics of singing. The pattern of acoustic resonances generated by a given vocal tract shape influences vowel identity, voice quality (timbre), and, to some degree, the spectral characteristics of the voice excitation source itself. This chapter is broadly focused on how the vo...

Why so Different? Aspects of Voice Characteristics in Operatic and Musical Theatre Singing

This thesis addresses aspects of voice characteristics in operatic and musical theatre singing. The common aim of the studies was to identify respiratory, phonatory and resonatory characteristics accounting for salient voice timbre differences between singing styles. The velopharyngeal opening (VPO) was analyzed in professional operatic singers, using nasofiberscopy. Differing shapes of VPOs su...

Estimation of voice source and vocal tract characteristics based on multi-frame analysis

This paper presents a new approach for estimating voice source and vocal tract filter characteristics of voiced speech. When it is required to know the transfer function of a system in signal processing, the input and output of the system are experimentally observed and used to calculate the function. However, in the case of source-filter separation we deal with in this paper, only the output (...

Accuracy evaluation of esophageal voice analysis based on automatic topology generated-voicing source HMM

An Auto-Regressive eXogenous (ARX) model combined with descriptive models of the glottal source waveform has been adopted to more accurately separate the vocal tract and the voicing source. However, these methods cannot be easily applied to the analysis of voices uttered by different speech production methods, such as esophageal voice. We previously proposed the Voicing Source Hidden Markov Mod...

Publication date: 2005